Container-as-a-service (CaaS) controller for monitoring clusters and implementing autoscaling policies

ABSTRACT

Embodiments described herein are generally directed to a controller of a managed container service that facilitates autoscaling based on bare metal machines available within a private cloud. According to an example, a CaaS controller of a managed container service monitors a metric of a cluster deployed on behalf of a customer within a container orchestration system. Responsive to a scaling event being identified for the cluster based on the monitoring and an autoscaling policy associated with the cluster, a BMaaS provider associated with the private cloud may be caused to create an inventory of bare-metal machines available within the private cloud. Finally, a bare metal machine is identified to be added to the cluster by selecting among the bare-metal machines based on the autoscaling policy, the inventory and a best fit algorithm configured in accordance with a policy established by or on behalf of the customer.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a Continuation of, and claims the priority benefit of, U.S. patent application Ser. No. 16/908,042 filed on Jun. 22, 2020. The disclosure of the above-referenced application is incorporated herein by reference in its entirety for all purposes.

BACKGROUND

Cloud providers deliver cloud computing based services and solutions to businesses and/or individuals. Virtual hardware, software, and infrastructure may be rented and provider-managed to deliver services in accordance with a variety of cloud service models including Container as a Service (CaaS), Virtual Machine as a Service (VMaaS), Storage as a Service (STaaS), and Bare Metal as a Service (BMaaS).

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments described here are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements.

FIG. 1 is a high-level block diagram conceptually illustrating a distribution of components of a system architecture of a managed container service in accordance with an example embodiment.

FIG. 2 is a block diagram conceptually illustrating various functional units of a container SaaS portal in accordance with an example embodiment.

FIG. 3 is a block diagram conceptually illustrating various functional units of a CaaS controller in accordance with an example embodiment.

FIG. 4 illustrates data associated with a cluster item of a blueprint meta-language or schema in accordance with an example embodiment.

FIG. 5 illustrates data associated with a blueprint item of a blueprint meta-language or schema in accordance with an example embodiment.

FIG. 6 illustrates a cluster blueprint in accordance with an example embodiment.

FIG. 7 is a flow diagram illustrating CaaS controller processing in accordance with an example embodiment.

FIG. 8 is a flow diagram illustrating best fit processing in accordance with an example embodiment.

FIG. 9 is a flow diagram illustrating best fit processing in accordance with another example embodiment.

FIG. 10 is a high-level flow diagram illustrating autoscaling processing in accordance with an example embodiment.

FIG. 11 is a flow diagram illustrating autoscaling processing involving identifying a bare metal machine to be added to a cluster in accordance with an example embodiment.

FIG. 12 is a block diagram of a computer system in accordance with an embodiment.

DETAILED DESCRIPTION

Embodiments described herein are generally directed to a controller of a managed container service that facilitates autoscaling based on bare metal machines available within a private cloud. In the following description, numerous specific details are set forth in order to provide a thorough understanding of example embodiments. It will be apparent, however, to one skilled in the art that embodiments described herein may be practiced without some of these specific details.

As a practical matter, public cloud providers tend to have virtually infinite pools of cloud machines. So, public cloud providers do not have to deal with a number of issues that arise in the context of private clouds. For example, CaaS on bare-metal infrastructure within an environment (e.g., a premises or co-location facility of an organization, entity, or individual, for example, representing a customer of the cloud provider and/or the CaaS) having a limited machine inventory in terms of the number and/or diversity of the types of servers requires a bit more finesse than simply creating a virtual machine based on an essentially limitless hardware pool. As such, autoscaling processing relating to a cluster (e.g., Kubernetes or Docker) within a limited-machine-inventory environment should take into consideration a variety of tradeoffs. For example, when multiple bare metal machines are available in the inventory that have resources (e.g., in terms of processor, memory, network capacity, and/or storage performance) in excess of a machine specification identified by the autoscaling policy associated with the cluster, one or more policy-based constraints (e.g., machine cost, cost of operation (power, cooling, etc.), performance, reliability (availability), security, etc.) defined by the cloud provider and/or a CaaS user or administrator may be employed to identify a best fit for a new machine to be added to the cluster as a result of a scale out or scale up action. A similar approach may also be used when removing a machine from a cluster, for example, responsive to a scale in or scale down action.

While for sake of brevity embodiments described herein may focus primarily on selection of bare metal machines in a limited machine inventory environment, the methodologies are equally applicable to creation and management of hybrid clusters involving both physical and virtual infrastructure and/or clusters spanning public and private clouds.

Terminology

The terms “connected” or “coupled” and related terms are used in an operational sense and are not necessarily limited to a direct connection or coupling. Thus, for example, two devices may be coupled directly, or via one or more intermediary media or devices. As another example, devices may be coupled in such a way that information can be passed there between, while not sharing any physical connection with one another. Based on the disclosure provided herein, one of ordinary skill in the art will appreciate a variety of ways in which connection or coupling exists in accordance with the aforementioned definition.

If the specification states a component or feature “may”, “can”, “could”, or “might” be included or have a characteristic, that particular component or feature is not required to be included or have the characteristic.

As used in the description herein and throughout the claims that follow, the meaning of “a,” “an,” and “the” includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.

Reference in the specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.

As used herein “cluster information” generally refers to information indicative of resources desired for a cluster. In some embodiments, cluster information may include a specification from bare metal aspects to container application aspects. For example, aspects specified by cluster information may include overall cluster parameters, machine type, networking features, storage specifications, and service definitions. In various embodiments described herein, the cluster information may be represented in the form of a cluster blueprint, which may be used to define the cluster specifics including compute, storage and networking and how these are to be assembled to build a complete functional cluster (e.g., Kubernetes or Docker).

As used herein, an “excess resource metric” generally refers to a metric indicative of an existence of resources in excess of those required to satisfy the needs of a cluster. For example, assuming a candidate machine in a machine inventory having 10 processor cores and 1 Terabyte (TB) of memory, such a candidate machine would have both excess processing capacity and memory capacity in comparison to a new cluster request indicative of a need for a machine with 2 processor cores and 128 Gigabytes (GB) of memory. Excess resource metrics may be used to quantify these excess resources in raw form (e.g., 8 excess processor cores and 872 GB excess memory) or may be normalized (e.g., 0.8 excess processing capacity and 0.872 excess memory capacity).
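The arithmetic in the foregoing example may be sketched as follows. This is a minimal illustration only: the names `Resources`, `excess_raw`, and `excess_normalized` are hypothetical, 1 TB is taken as 1000 GB so as to match the 872 GB figure, and normalization is assumed to divide each raw excess by the capacity of the candidate machine, which reproduces the 0.8 and 0.872 values given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical container for the two resources used in the example."""
    cores: int
    memory_gb: int


def excess_raw(candidate: Resources, requested: Resources) -> Resources:
    """Raw excess: candidate capacity minus requested capacity, per resource."""
    return Resources(
        cores=candidate.cores - requested.cores,
        memory_gb=candidate.memory_gb - requested.memory_gb,
    )


def excess_normalized(candidate: Resources, requested: Resources) -> dict:
    """Normalized excess: raw excess as a fraction of candidate capacity (assumed)."""
    raw = excess_raw(candidate, requested)
    return {
        "cores": raw.cores / candidate.cores,
        "memory": raw.memory_gb / candidate.memory_gb,
    }


# Candidate machine and cluster request from the example (1 TB taken as 1000 GB).
candidate = Resources(cores=10, memory_gb=1000)
requested = Resources(cores=2, memory_gb=128)
```

Other normalizations (e.g., relative to the requested amount) are equally possible; the one shown is merely the one consistent with the figures in the example.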

FIG. 1 is a high-level block diagram conceptually illustrating a distribution of components of a system architecture 100 of a managed container service in accordance with an example embodiment. In various embodiments described herein, the managed container service offers operating system virtualization using containers (e.g., provides Docker containers and Kubernetes orchestration as a service) using infrastructure of a customer's private cloud (e.g., an on-premises data center or a colocation facility). The managed container service may facilitate deployment and operation of cloud native applications for a variety of use cases, including, but not limited to, Edge, Artificial Intelligence/Machine Learning (AI/ML), and High Performance Compute (HPC). The managed container service may provide a fully-managed solution in which a managed service provider (MSP) operates CaaS instances and assists with the deployment and operation of customers' container-based workloads. According to one embodiment, cluster information may be supplied to a SaaS-based service (e.g., container SaaS portal 130) to define cluster specifics including compute, storage and networking and how these are to be assembled to build a complete functional cluster, and a set of controllers (e.g., BMaaS controller 166, STaaS controller 156, VMaaS controller 146, and CaaS controller 160) carry out the instantiation of the cluster in accordance with the cluster information. The resultant cluster may then be consumed by a user (e.g., one of CaaS users 102) and managed by a cluster manager (e.g., container cluster manager 170).

In the context of the present example, components residing within a private cloud (e.g., an on-premises data center or a colocation facility) are shown on the left and components residing within a public cloud are shown on the right. In one embodiment, private cloud components include infrastructure 110, the BMaaS controller 166, the STaaS controller 156, the VMaaS Controller 146, a virtual machine manager (VMM) 147, the CaaS controller 160, and the container cluster manager 170; and public cloud components include a bare metal SaaS portal 165, a storage SaaS portal 155, and the container SaaS portal 130.

According to one embodiment, the container SaaS Portal 130 represents a web-based portal in the form of a cloud hosted multi-tenant service that allows creation of a physical cluster, a virtual cluster or a hybrid cluster based on cluster information, for example, in the form of cluster blueprints 105, which may be predefined or created by a CaaS administrator 101 and/or CaaS users 102. In one embodiment, the use of cluster blueprints 105 facilitates the creation by a user of a complete functional cluster including compute, networking and storage resources as well as a set of applications to be deployed by simply referencing an existing blueprint. A catalog of blueprints may be provided to allow a user to choose a blueprint from the catalog that matches their needs. For example, there may be predefined blueprints that allow for creation of Artificial Intelligence/Machine Learning (AI/ML) clusters as well as other predefined blueprints for general compute clusters. A non-limiting example of a cluster blueprint is described below with reference to FIG. 6.

Continuing with the present example, CaaS administrator 101 and CaaS users 102 may make use of the container SaaS portal 130 to perform various life-cycle management (LCM) operations relating to clusters (e.g., Kubernetes or Docker) that are based on the infrastructure 110, which may include physical and/or virtual infrastructure, including networking infrastructure 111, storage infrastructure 112 and compute infrastructure 113. The LCM operations may include initial compute cluster creation, cluster modification in which infrastructure is added to or removed from a cluster, cluster updates in which existing infrastructure may be modified, and the destruction of a cluster. In one embodiment, Application Programming Interfaces (e.g., Representational State Transfer (REST) APIs) provided by the container SaaS portal 130 support full LCM operations on clusters and are based on the OpenAPI (Swagger) definition. The status of cluster LCM operations may be tracked from the container SaaS portal 130 or from the Kubernetes command line, for example. The container SaaS portal 130 may also use REST to communicate with other services (e.g., the bare metal SaaS portal 165, the storage SaaS portal 155, and the VM SaaS portal 145) upon which it depends to obtain information about the infrastructure 110 and/or implement various tasks associated with the LCM operations. Further details regarding a non-limiting example of the container SaaS portal 130 are described below with reference to FIG. 2.

The bare metal SaaS portal 165 may represent a web-based portal in the form of a cloud hosted service of a particular BMaaS provider (which may be the same or a different provider than the cloud provider) that interacts with the BMaaS controller 166 to carry out various aspects of instantiation of the cluster. For example, the BMaaS controller 166 may be used to install the appropriate firmware and software onto a bare metal machine selected for inclusion with a cluster by the CaaS controller 160.

Similarly, the storage SaaS portal 155 and the VM SaaS portal 145 may represent web-based portals of respective STaaS and VMaaS providers used by the customer and which are used to interface with the infrastructure 110 via the STaaS controller 156 and the VMaaS controller 146, respectively. In one embodiment, the VMaaS controller 146 may make use of the VMM 147 to create appropriately sized control plane nodes to run a container control plane for the requested cluster. Advantages of the layered approach implemented by system architecture 100 include enabling the container SaaS portal 130 to be built on other “as a service” offerings (e.g., BMaaS, STaaS, and VMaaS) of the cloud provider or a third-party provider, facilitating extensibility to include other offerings (e.g., networking and compute), as well as enabling the creation of value-add services on top of CaaS or Kubernetes as a Service (KaaS). More or fewer types of infrastructure or providers may be supported depending upon the needs of the particular implementation, for example, by adding or removing appropriate SaaS portals and associated controllers.

In the context of the present example, the CaaS controller 160 runs on-premises and is controlled by the container SaaS portal 130. In one embodiment, the CaaS controller 160 may be a Kubernetes cluster and may be controlled via kubectl API calls invoked by the container SaaS portal 130. In such a scenario, the CaaS controller 160 is effectively a bootstrap cluster that allows target clusters (e.g., clusters 120) to be created and managed. In some embodiments, one or more of the BMaaS controller 166, the STaaS controller 156, and the VMaaS controller 146 may also be integrated into the bootstrap cluster, for example, using “kube-native” methods. Further details regarding a non-limiting example of the CaaS controller 160 are described below with reference to FIG. 3.

The container cluster manager 170 may be responsible for installing a container orchestration system on newly provisioned nodes. In one embodiment, the container cluster manager 170 includes a “Kubernetes Engine” (e.g., Hewlett Packard Enterprise (HPE) Container Platform, Rancher Kubernetes Engine (RKE), Loodse Kubermatic Container Engine, Google Kubernetes Engine (GKE), Kubernetes+Cluster API, or others) to install Kubernetes and create a cluster. After the cluster is created, the CaaS controller 160 may monitor the state of the cluster and can take corrective action if needed. For example, if a machine fails in a way that cannot be repaired, another machine can be allocated, provisioned and added to the cluster to replace the failed machine.

The various portals (e.g., bare metal SaaS portal 165, storage SaaS portal 155, VM SaaS portal 145, and container SaaS portal 130) and controllers (e.g., BMaaS controller 166, STaaS controller 156, VMaaS controller 146, and CaaS controller 160) and the functionality performed by them may be implemented by hardware, software, firmware and/or a combination thereof. For example, the portals and controllers may be implemented in the form of executable instructions stored on a machine readable medium and executed by a processing resource (e.g., a microcontroller, a microprocessor, central processing unit core(s), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), and the like) and/or in the form of other types of electronic circuitry.

While for sake of simplicity various examples may be described with reference to a single customer or a single customer site (e.g., on-premises datacenter or colocation facility), it is to be appreciated that the various portals described herein may interact with controllers associated with multiple customers and/or distributed across multiple sites. Additionally, although in the present example, the controllers and SaaS portals are shown distributed between the private cloud and public cloud in a particular manner, depending upon the particular implementation these components may be distributed differently. For example, one or more of the controllers (e.g., the CaaS controller 160) may be provided within a public cloud. Also, the same or different system architectures (e.g., system architecture 100) may be implemented for one or more customers of the cloud provider. It is further contemplated that various components of the system architecture may be implemented by the same or different vendors or service providers. For example, a cloud provider that has one or more existing “as a service” offerings may leverage such existing offerings and/or may make use of third-party services.

FIG. 2 is a block diagram conceptually illustrating various functional units of a container SaaS portal 230 in accordance with an example embodiment. In the context of the present example, the container SaaS portal 230 includes a user interface 232, a CaaS REST API server 235, a CaaS worker 236, and a CaaS resource database 234. In one embodiment, the user interface 232 and the CaaS REST API server 235 represent a northbound interface (or frontend) for accepting REST requests to perform Create, Read, Update and Delete (CRUD) operations on clusters in accordance with cluster blueprints 205 and persisting them in the CaaS resource database 234. For example, the CaaS REST API server 235 may provide self-service APIs for users (e.g., CaaS users 102) to create their own clusters (e.g., Kubernetes clusters) and administrator APIs for CaaS administrators (e.g., CaaS administrator 101) to create and assign clusters to groups of users.

According to one embodiment, separation of concerns and scaling may be addressed by implementing a backend in the form of one or more workers (e.g., the CaaS worker 236) of the container SaaS portal 230 that are responsible for ensuring that operations requested via the REST interface of the container SaaS portal 230 are realized. In the context of the present example, an internal inter-process communication (IPC) mechanism (e.g., gRPC Remote Procedure Call (gRPC)) is utilized to communicate between the frontend and the backend, and the CaaS worker 236 may communicate information regarding cluster requests to the CaaS controller (e.g., CaaS controller 160) via kubectl over Remote Data Access (RDA).

In one embodiment, role-based access control (RBAC), for example, supported by identity provider 210, may be used to securely accommodate the needs of different user personas. In this manner, for example, separation can be achieved between (i) cloud provider operations or administrative personnel (e.g., CaaS administrator 101) that use the container SaaS portal 230 to operate and manage customers' managed container environments and (ii) customers' (tenants') self-service users (e.g., CaaS users 102) of the container SaaS portal 230 for CaaS and/or KaaS.

FIG. 3 is a block diagram conceptually illustrating various functional units of a CaaS controller 360 in accordance with an example embodiment. In the context of the present example, the CaaS controller 360 includes an API server 370, a cluster controller 362, container cluster manager interfaces 363 a-n, a machine controller 364, and various provider interfaces 365 a-n. The CaaS controller 360 and the functionality performed by the CaaS controller 360 may be implemented by hardware, software, firmware and/or a combination thereof. For example, the CaaS controller 360 may be implemented in the form of executable instructions stored on a machine readable medium and executed by a processing resource (e.g., a microcontroller, a microprocessor, central processing unit core(s), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), and the like) and/or in the form of other types of electronic circuitry.

According to one embodiment, creation of a cluster involves selection or input of cluster information 305 (e.g., in the form of a cluster blueprint (e.g., cluster blueprint 105)) via a CaaS SaaS portal (e.g., container SaaS portal 130). The CaaS SaaS portal may control the CaaS controller 360 via API calls (e.g., kubectl API calls) to the API server 370. In the present example, the API server 370 provides Custom Resource Definitions (CRDs) (e.g., cluster CRD(s) 372 and machine CRD(s)) for various objects supported by the managed container service, including, for example, a cluster, a machine, a machine set, and a machine deployment. Depending upon the particular implementation, the CRDs may be based on Kubernetes community “Cluster API” CRDs.

Cluster objects may provide a high level description of their respective clusters including an Internet Protocol (IP) address, Domain Name Service (DNS) information, and the like. In one embodiment, machine objects are agnostic to physical versus virtual machines and include provider-specific details for the desired machines. Machine set objects may be supported to allow specification of a set of multiple machines. Machine deployment objects may be used to automate upgrades.

Responsive to the cluster CRD(s) 372, the cluster controller 362 may direct cluster operations to an appropriate container cluster manager interface 363 a-n. For example, depending upon a cluster specification indicated within the cluster information 305, the cluster controller 362 may use container cluster manager interface 363 a to interact with an RKE Kubernetes distribution or container cluster manager interface 363 n to interact with another type of Kubernetes engine.

Similarly, machine controller 364 may be responsible for directing machine operations to an appropriate provider interface 365 a-n. Depending upon a machine specification indicated within the cluster information 305, the machine controller 364 may use BM provider interface 365 a to interact with a BMaaS provider (e.g., via BMaaS APIs associated with a bare metal SaaS portal (e.g., bare metal SaaS portal 165)) and VM provider interface 365 n to interact with a VMaaS provider (e.g., via VMaaS APIs associated with a VM SaaS portal (e.g., VM SaaS portal 145)). For example, machine controller 364 may utilize Terraform providers for infrastructure (e.g., BMaaS, VMaaS or any IaaS) and Ansible playbooks to manage installed OS components (e.g., Docker, agents, base configurations, and initial Helm charts).
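The manner in which a machine controller may direct an operation to a provider interface based on the machine specification can be sketched as a simple lookup. The sketch below is illustrative only: the registry, the `provider` and `size` keys, and the handler functions are hypothetical stand-ins and do not call any real BMaaS or VMaaS API.

```python
from typing import Callable, Dict


def bm_provider_interface(machine_spec: dict) -> str:
    """Hypothetical stand-in for a BM provider interface."""
    return f"BMaaS: provision {machine_spec['size']} machine"


def vm_provider_interface(machine_spec: dict) -> str:
    """Hypothetical stand-in for a VM provider interface."""
    return f"VMaaS: provision {machine_spec['size']} machine"


# Registry of provider interfaces keyed by the provider named in the machine
# specification (key names are assumptions made for this sketch).
PROVIDER_INTERFACES: Dict[str, Callable[[dict], str]] = {
    "bmaas": bm_provider_interface,
    "vmaas": vm_provider_interface,
}


def direct_machine_operation(machine_spec: dict) -> str:
    """Route a machine operation to the interface matching the specification."""
    provider = machine_spec["provider"]
    try:
        handler = PROVIDER_INTERFACES[provider]
    except KeyError:
        raise ValueError(f"no provider interface registered for {provider!r}") from None
    return handler(machine_spec)
```

Supporting an additional provider in such an arrangement amounts to registering another entry, which mirrors the extensibility described for the layered architecture.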

FIG. 4 illustrates data associated with a cluster item 400 of a blueprint meta-language or schema in accordance with an example embodiment. In various embodiments described herein, declarative models may be used for cluster LCM using cluster blueprints (e.g., cluster blueprints 105 or 205). In one embodiment, a blueprint meta-language (e.g., JavaScript Object Notation (JSON), YAML Ain't Markup Language (YAML), and/or the Terraform language) or schema includes (i) the cluster blueprint; (ii) machine blueprints defining different types of compute resources to be used as part of the cluster blueprint; (iii) networking blueprints defining networking topologies and features for the cluster; (iv) storage blueprints defining storage to be used within the cluster; and (v) service blueprints defining services to be pre-installed on a newly created cluster.

In the context of the present example, cluster item 400 includes an ID, a name, a blueprintID, a createdDate, a lastUpdateDate, and a state. The ID may be a string representing a unique identifier (e.g., a Universally Unique Identifier (UUID)) for the cluster. The name may be a string representing a user-assigned name to the cluster and which may be displayed in the catalog, for example. The blueprintID may be a string representing a unique identifier (e.g., a UUID) for a blueprint item associated with the cluster. The createdDate may indicate the date and time at which the cluster was created and may be represented in the form of a string. The lastUpdateDate may indicate the date and time at which the cluster was last updated and may be represented in the form of a string. The state, for example, monitored and updated by a CaaS controller (e.g., CaaS controller 160) may be selected from a predefined set of enumerated values (e.g., pending, ready, error, or offline) and may be represented in the form of a string.
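For purposes of illustration, the fields of the cluster item may be modeled as shown below. This is a sketch under stated assumptions rather than the schema of FIG. 4: the field names follow the description above, while the class names, the use of ISO 8601 strings for the dates, and the `new_cluster_item` helper are hypothetical.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class ClusterState(str, Enum):
    """Enumerated cluster states, each represented as a string."""
    PENDING = "pending"
    READY = "ready"
    ERROR = "error"
    OFFLINE = "offline"


@dataclass
class ClusterItem:
    """Field names follow the description; the types are assumptions."""
    id: str              # UUID string identifying the cluster
    name: str            # user-assigned name
    blueprintID: str     # UUID string of the associated blueprint item
    createdDate: str     # date and time of creation, as a string
    lastUpdateDate: str  # date and time of last update, as a string
    state: ClusterState = ClusterState.PENDING


def new_cluster_item(name: str, blueprint_id: str) -> ClusterItem:
    """Hypothetical helper that creates a cluster item in the pending state."""
    now = datetime.now(timezone.utc).isoformat()
    return ClusterItem(
        id=str(uuid.uuid4()),
        name=name,
        blueprintID=blueprint_id,
        createdDate=now,
        lastUpdateDate=now,
    )
```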

FIG. 5 illustrates data associated with a blueprint item 500 of a blueprint meta-language or schema in accordance with an example embodiment. The blueprint item 500 may declaratively describe the desired cluster, for example, including master and worker node sizes, amounts, and quality attributes (e.g., availability and performance). Cluster blueprints may also define required storage and networking characteristics as well as other curated services to deploy, for example, cluster and workload observability services. Depending upon the particular implementation, cluster blueprints may also include service-specific representations of desired state as well as other well-known representations (e.g., Terraform infrastructure plans).

In the context of the present example, blueprint item 500 includes an ID, a name, a version, a k8sVersion, a createdDate, a lastUpdateDate, a machine specification, a cluster specification, a storage specification, and information regarding desired master and worker nodes. As described above with reference to the cluster item, the ID may be a string representing a unique identifier (e.g., a UUID) for the blueprint. The name may be a string representing a user-assigned name to the blueprint and which may be displayed in the catalog, for example. The createdDate may indicate the date and time at which the blueprint was created and may be represented in the form of a string. The lastUpdateDate may indicate the date and time at which the blueprint was last updated and may be represented in the form of a string. The machine specification may include information indicative of the provider for the desired machine. The cluster specification may include information indicative of the desired container cluster manager (e.g., container cluster manager 170), for example, the desired Kubernetes engine. The storage specification may include information indicative of a type of storage infrastructure (e.g., storage infrastructure 112) to be used in the cluster.

FIG. 6 illustrates a cluster blueprint 605 in accordance with an example embodiment. In the context of the present example, cluster blueprint 605 defines a Kubernetes cluster to be created via RKE having one small master node and one medium bare metal-based worker node.
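A declarative description of such a cluster might resemble the following sketch. The layout is an assumption made for illustration and is not a reproduction of FIG. 6: the key names, the blueprint name, and the helper functions are hypothetical, and only the engine (RKE), the node sizes, the node counts, and the bare metal worker come from the description above.

```python
# Hypothetical blueprint layout; only the engine, node sizes, node counts, and
# the bare metal worker are taken from the description of cluster blueprint 605.
cluster_blueprint = {
    "name": "example-rke-cluster",  # hypothetical user-assigned name
    "clusterSpec": {"engine": "RKE"},
    "masterNodes": [{"size": "small", "count": 1}],
    "workerNodes": [{"size": "medium", "count": 1, "provider": "bmaas"}],
}

# Fields this sketch treats as mandatory (an assumption, not a stated rule).
REQUIRED_FIELDS = ("name", "clusterSpec", "masterNodes", "workerNodes")


def missing_fields(blueprint: dict) -> list:
    """Return the required fields that are absent from the blueprint."""
    return [field for field in REQUIRED_FIELDS if field not in blueprint]


def total_nodes(blueprint: dict) -> int:
    """Count all master and worker nodes requested by the blueprint."""
    groups = blueprint["masterNodes"] + blueprint["workerNodes"]
    return sum(group["count"] for group in groups)
```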

The various portals and controllers described herein and the processing described below with reference to the flow diagrams of FIGS. 7-11 may be implemented in the form of executable instructions stored on a machine readable medium and executed by a processing resource (e.g., a microcontroller, a microprocessor, central processing unit core(s), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), and the like) and/or in the form of other types of electronic circuitry. For example, the processing may be performed by one or more virtual or physical computer systems of various forms, such as the computer system described with reference to FIG. 12 below.

FIG. 7 is a flow diagram illustrating CaaS controller processing in accordance with an example embodiment. In the context of the present example, a cloud provider may have been engaged by a particular customer or multiple customers to provide and support a managed container service that makes use of their private cloud infrastructure, for example, including bare metal servers.

At block 710, cluster information associated with a request to create a container cluster on behalf of a customer is received by a CaaS controller. According to one embodiment, the CaaS controller (e.g., CaaS controller 160) runs within a customer's private cloud, for example, on on-premises infrastructure or infrastructure within a colocation facility used by the customer. The CaaS controller may receive the cluster information in the form of a cluster blueprint (e.g., cluster blueprint 105) from a container SaaS portal (e.g., container SaaS portal 130) running in the same or a different private or public cloud as the CaaS controller. Depending upon the particular implementation, the cluster information may declaratively describe the desired cluster. For example, a cluster blueprint may be selected by a CaaS user (e.g., CaaS user 102) from a predefined set of cluster blueprints presented via a user interface (e.g., user interface 232) in which the selected cluster blueprint includes master and worker node sizes, amounts, and quality attributes (e.g., availability and/or performance). Cluster blueprints may also define desired storage and networking characteristics as well as other curated services to deploy, for example, cluster and workload observability services. Cluster blueprints may also include system-specific representations of desired state as well as other well-known representations (e.g., Terraform infrastructure plans).

At block 720, an inventory of bare metal machines available within a private cloud of the customer is received via a BMaaS provider. According to one embodiment, the inventory contains real-time information indicative of respective resources (e.g., a number of processor cores, an amount of memory, network capacity, and/or storage performance) for one or more types of infrastructure (e.g., infrastructure 110), including a set of bare metal machines, that are currently available (e.g., are not currently deployed for use by another cluster) for use in connection with supporting the managed container service. Depending upon the particular implementation, the inventory may be requested from the BMaaS provider by the CaaS controller directly (e.g., via a bare metal SaaS portal of the BMaaS provider) or indirectly (e.g., via the CaaS portal).

In various embodiments, the inventory may include or otherwise be mapped to metadata or other information associated with the available bare metal machines for use in connection with prioritizing, guiding, directing or otherwise influencing machine selection, for example, by optimizing, minimizing, or maximizing various factors or conditions. Non-limiting examples of the metadata or other information include information indicative of one or more of machine characteristics/attributes (e.g., cost, power consumption, heat, performance, security, reliability, etc.) in the form of relative or absolute metrics/ratings or raw or normalized data.

At block 730, a bare metal machine is identified for the cluster based on the inventory received in block 720, the cluster information received in block 710, and a best fit algorithm configured in accordance with a policy established by or on behalf of the customer. Despite the customer having a variety of bare metal machine configurations, it is unlikely the customer will have a sufficient number of such configurations to precisely match the range of all potential cluster requests. For the sake of example, suppose the managed container service uses four enumerated sizes (Small, Medium, Large, Extra Large) for four resources: processor, memory, network capacity, and storage performance. In this example, there are 256 combinations of the resources, but it is unlikely that the customer will have 256 different machine configurations to choose from, and the number of possibilities grows very rapidly as the enumerated categories increase and/or as resources are added. Because it is impractical for a customer to attempt to have bare metal machine configurations that meet every possible machine specification that may be desired by a CaaS user, it is desirable to have a machine selection process to facilitate selection of an appropriate machine from the available inventory to satisfy the user's request. For example, while a number of the available machines may have sufficient resources to meet the needs indicated by the user's request, some of the machines may have one or more types of resources in excess of those needed by the desired cluster or may be likely to be needed to service other cluster requests. As such, embodiments described herein provide a policy-based approach to allow the cloud provider and/or the customer to express one or more machine-selection priorities to be applied as part of a best fit algorithm. Non-limiting examples of best fit processing that may be part of the best fit algorithm are described below with reference to FIGS. 8 and 9.
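The count of 256 in the example above follows from four sizes chosen independently for each of four resources (4 to the 4th power). The following Python sketch is provided for illustration only; the names SIZES, RESOURCES, and count_specifications are hypothetical and do not appear in the described embodiments.

```python
from itertools import product

# The four enumerated sizes and four resources named in the example above.
SIZES = ("Small", "Medium", "Large", "Extra Large")
RESOURCES = ("processor", "memory", "network capacity", "storage performance")


def count_specifications(num_sizes: int, num_resources: int) -> int:
    """Number of distinct machine specifications when each resource
    independently takes one of num_sizes enumerated sizes."""
    return num_sizes ** num_resources


# Enumerate every specification explicitly as a tuple of sizes,
# one entry per resource, to confirm the closed-form count.
all_specifications = list(product(SIZES, repeat=len(RESOURCES)))

if __name__ == "__main__":
    print(len(all_specifications))                          # four sizes, four resources
    print(count_specifications(len(SIZES) + 1, len(RESOURCES)))  # one more size
    print(count_specifications(len(SIZES), len(RESOURCES) + 1))  # one more resource
```

Adding a fifth size or a fifth resource in this sketch shows how quickly the specification space outgrows any realistic set of machine configurations.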

In the context of various examples described herein, the CaaS user may specify the desired cluster in a form in which resources are described at a reasonably high level. While it is possible to have the user specify a machine with particularity, for example, a particular model of a particular manufacturer with a particular type of processor, a specific amount of memory, and a particular type of Graphics Processing Unit (GPU), it is typically more efficient for a user to specify a machine based on something more abstract. Depending upon the particular manner in which the machines are categorized, an internal mapping of these categories (e.g., sizes) to the reality presented to the user may be utilized as part of the machine selection process.

FIG. 8 is a flow diagram illustrating best fit processing in accordance with an example embodiment. At block 810, a set of candidate machines is created based on the inventory. According to one embodiment, the set of candidate machines is a subset of the available bare metal machines in the inventory that have sufficient resources to satisfy the cluster request. That is, each candidate machine in the set of candidate machines has resources equal to or greater than those indicated by the request. Identification of the candidate set may involve using an internal mapping of machine categories to corresponding resource ranges to transform a cluster request expressed in terms of a machine category to explicit quantities of resources. Then, for each available machine in the inventory, the amount of each type of resource needed to satisfy the request may be compared to corresponding amounts of each type of resource of the machine configuration to determine whether to add the available machine to the candidate set.

At block 820, an excess resource metric for each candidate machine in the set of candidate machines is calculated. According to one embodiment, the excess resource metric may be calculated concurrently with the identification of the candidate set. Alternatively, the excess resource metric may be calculated after the candidate set has been completed. The calculation may involve subtracting the amount of resources needed to satisfy the request from those available as part of a particular machine configuration and aggregating or averaging the results for each type of resource into a single excess resource metric. Alternatively, the excess resource metric may comprise multiple components, one for each type of resource.

At block 830, a bare metal machine in the set of candidate machines having the excess resource metric indicative of a least amount of excess resources is selected for the cluster.
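For purposes of illustration only, blocks 810 through 830 may be sketched in Python as follows. The Machine type, the function names, and the sample inventory are hypothetical. The sketch also assumes the resource quantities have been normalized to comparable units so that per-resource excesses can simply be summed; averaging or per-resource components, as noted above, are equally possible.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Machine:
    """Hypothetical inventory entry: a name plus resource quantities."""
    name: str
    resources: Dict[str, float]


def is_candidate(machine: Machine, request: Dict[str, float]) -> bool:
    """Block 810: a machine qualifies if every requested resource is met."""
    return all(machine.resources.get(r, 0) >= need for r, need in request.items())


def excess_metric(machine: Machine, request: Dict[str, float]) -> float:
    """Block 820: sum of (available - needed) over the requested resources.
    Assumes quantities are normalized so that summing them is meaningful."""
    return sum(machine.resources[r] - need for r, need in request.items())


def select_best_fit(inventory: List[Machine],
                    request: Dict[str, float]) -> Optional[Machine]:
    """Block 830: pick the candidate with the least excess, or None."""
    candidates = [m for m in inventory if is_candidate(m, request)]
    if not candidates:
        return None
    return min(candidates, key=lambda m: excess_metric(m, request))


# Illustrative data only; not taken from the described embodiments.
inventory = [
    Machine("m1", {"cores": 8, "memory": 32, "network": 10, "storage": 2}),
    Machine("m2", {"cores": 16, "memory": 64, "network": 10, "storage": 4}),
    Machine("m3", {"cores": 4, "memory": 16, "network": 1, "storage": 1}),
]
request = {"cores": 8, "memory": 24, "network": 10, "storage": 2}
chosen = select_best_fit(inventory, request)
```

In this sample, m3 is excluded for insufficient resources, and m1 is preferred over m2 because it leaves fewer resources unused.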

FIG. 9 is a flow diagram illustrating best fit processing in accordance with another example embodiment. In the context of the present example, additional information is assumed to be available to assist the machine selection process. For example, information regarding a lifespan of the cluster request may be included as part of the cluster request or learned based on historical data. Additionally, information that quantifies a probability metric at the machine-level or the resource-level that is indicative of a probability that a machine or resource will be needed to satisfy a subsequent request during the lifespan may be included with the inventory or learned based on historical data. Blocks 910 and 920 may be as described above with reference to FIG. 8. At block 930, the bare metal machine is selected from the set of candidate machines having an excess resource metric indicative of a least amount of excess resources and that also minimizes a probability the excess resources will be needed to satisfy another request during the lifetime of the request. According to one embodiment, this minimization involves minimizing the sum of the probability metrics of the excess resources of the selected bare metal machine. In alternative embodiments, the minimization may be performed at the machine-level to minimize the probability the selected machine will be needed to satisfy another request during the lifetime of the request.
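One possible reading of block 930 is sketched below in Python for illustration only. How the two criteria are combined is not specified above; this sketch assumes a lexicographic ordering in which least excess is compared first and the summed resource-level probability metrics break ties. All names and sample values are hypothetical.

```python
from typing import Dict, Optional

Resources = Dict[str, float]


def excess_by_resource(machine: Resources, request: Resources) -> Resources:
    """Per-resource excess of a candidate machine over the request."""
    return {r: machine[r] - need for r, need in request.items()}


def demand_risk(excess: Resources, probability: Resources) -> float:
    """Sum of the probability metrics of the resources that are in excess."""
    return sum(probability.get(r, 0.0) for r, e in excess.items() if e > 0)


def select_machine(inventory: Dict[str, Resources],
                   request: Resources,
                   probabilities: Dict[str, Resources]) -> Optional[str]:
    """Assumed ordering: least total excess first, then least demand risk."""
    best_name, best_key = None, None
    for name, machine in inventory.items():
        if any(machine.get(r, 0) < need for r, need in request.items()):
            continue  # not a candidate (blocks 910 and 920 as in FIG. 8)
        excess = excess_by_resource(machine, request)
        key = (sum(excess.values()),
               demand_risk(excess, probabilities.get(name, {})))
        if best_key is None or key < best_key:
            best_name, best_key = name, key
    return best_name


# Illustrative data only: A and B have equal total excess, but A's spare
# memory is more likely to be needed by a later request than B's spare cores.
inventory = {
    "A": {"cores": 8, "memory": 40},
    "B": {"cores": 16, "memory": 32},
    "C": {"cores": 32, "memory": 64},
}
probabilities = {
    "A": {"cores": 0.1, "memory": 0.7},
    "B": {"cores": 0.2, "memory": 0.6},
    "C": {"cores": 0.5, "memory": 0.5},
}
request = {"cores": 8, "memory": 32}
chosen = select_machine(inventory, request, probabilities)
```

A weighted combination of the two criteria, or a machine-level probability metric in place of the resource-level sum, would be alternative designs consistent with the description above.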

While for sake of brevity some examples of a machine selection approach have been provided above with reference to FIGS. 8 and 9, those skilled in the art will appreciate the applicability of the methodologies described herein extends beyond these particular examples. For example, to the extent metadata or information associated with bare metal machines is available that is indicative of their relative power usage, security, reliability, and/or other factors that may be desirable as the basis on which to prioritize machine selection, such metadata or information may be taken into consideration by the machine selection process. Furthermore, in some embodiments, machine learning and/or big data analytics may be used by the CaaS controller to reveal patterns and/or probabilities of cluster requests for users, workloads, machines and/or resources. Since the provider manages the site and therefore has insight into, among other things, the users, cluster requests made by particular users over time, machine demand and usage over time, and what is being run on the private cloud infrastructure, historical data may be used alone or in combination with machine learning to assist the machine selection process. For example, the managed CaaS system may “learn” that a particular user commonly requests bigger machines than necessary for the workload at issue and, as a result, the managed CaaS system may allocate a machine that is slightly smaller than requested by the particular user. Similarly, the managed CaaS system may observe a pattern that the particular user tends to request undersized machines and proactively offer the particular user the option to select a larger machine. Alternatively or additionally, the managed CaaS system may take into consideration machine and/or resource demand/usage patterns so as to optimize allocation of machines in a manner that increases the likelihood of machine availability for anticipated workload demands and hence profitability of the managed CaaS for the cloud provider.

Additional machine selection examples include, but are not limited to:

-   Using information regarding security vulnerability of particular machine configurations, operating systems, application programs and/or combinations thereof to guide the machine selection process.
-   Using machine learning to optimize machine configurations, operating system parameters, application programs and/or combinations thereof for commonly observed workloads.
-   Using characteristics that affect availability, such as ensuring that machines providing redundancy have independent power connections and network paths.
-   Ensuring workloads that might require significant power are placed on machines in locations with favorable cooling (e.g., put jobs that are likely to run hot on machines that are right over the air conditioner vents).
-   Using performance characteristics to optimize the performance of the resources allocated. For example, if there are multiple speed networks in the data center, ensuring workloads that require significant network bandwidth are allocated on high speed networks.
-   Cost of operations: some machines require more power and cooling to perform the same work, so workloads that require significant power may be placed on machines that have lower power requirements.
-   Reliability: some machines might have better track records.
-   Excess capacity: if certain workloads are more likely than others to grow, potential future disruptions may be avoided by putting such workloads on bigger machines.

FIG. 10 is a high-level flow diagram illustrating autoscaling processing in accordance with an example embodiment. In the context of the present example, cloud-bursting or workload shifting techniques may be used to facilitate handling of various autoscaling actions (e.g., scale out to add a component to a cluster to spread out the load among more components, scale up a component of the cluster, for example, to make the component bigger or faster so that the component can handle more load, scale in to remove a component from a cluster to consolidate the load among fewer components, and scale down a component of the cluster, for example, to make the component smaller or slower so that the component can more cost efficiently handle less load). In one embodiment, an autoscaling policy may be provided at the time of cluster creation as part of cluster information (e.g., cluster blueprint 105) and may specify conditions (e.g., upper bounds and/or lower bounds for metric values or statistics and corresponding time periods) for triggering corresponding autoscaling actions (e.g., scale out or scale in the machine instances of a cluster by a particular amount). In one embodiment, an autoscaling policy may include a rule (e.g., a conditional expression) and a corresponding autoscaling action to be performed when the rule is satisfied (e.g., when the conditional expression is true). For example, an autoscaling policy for a particular cluster may indicate an additional small bare metal machine is to be added to the cluster responsive to any central processing unit (CPU) of an existing bare metal machine in the cluster exceeding an average of 90% utilization over a period of 1 hour. Those skilled in the art will appreciate instantaneous values of metrics or various other statistics (e.g., mean, maximum, minimum, standard deviation, percentiles, or the like) for metrics may be used to define autoscaling policies.
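By way of a non-limiting sketch, the example rule above (add one small bare metal machine when average CPU utilization exceeds 90% over 1 hour) might be represented and evaluated in Python as follows. The Rule fields, the sample format of (timestamp in seconds, value) pairs, and the function name are hypothetical, and the sketch does not verify that the samples actually span the full window.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, metric value)


@dataclass
class Rule:
    """Hypothetical representation of one autoscaling rule and its action."""
    metric: str
    upper_bound: float      # threshold on the windowed average
    window_seconds: int     # time period over which the average is taken
    action: str             # e.g., "scale out"
    machine_size: str       # size of machine the action refers to
    count: int = 1          # number of machines to add or remove


def rule_satisfied(rule: Rule, samples: List[Sample], now: float) -> bool:
    """True when the mean of the samples inside the window exceeds the bound."""
    window = [v for t, v in samples if now - rule.window_seconds <= t <= now]
    return bool(window) and mean(window) > rule.upper_bound


# The example policy from the text: CPU averaging above 90% over 1 hour.
cpu_rule = Rule(metric="cpu_utilization", upper_bound=90.0,
                window_seconds=3600, action="scale out", machine_size="small")

# Illustrative samples taken every 10 minutes for one hour.
hot = [(t, 95.0) for t in range(0, 3601, 600)]
cool = [(t, 50.0) for t in range(0, 3601, 600)]
```

Other statistics mentioned above (maximum, percentiles, and so on) could be substituted for the mean without changing the overall shape of the rule.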

At block 1010, various metrics associated with operation of clusters are monitored. According to one embodiment, a CaaS controller (e.g., CaaS controller 160) periodically retrieves metrics (e.g., processing resource utilization, central processing unit (CPU) utilization, graphics processing unit (GPU) utilization, memory utilization, inbound and/or outbound network traffic, Input/Output Operations Per Second (IOPS), raw speed, latency, redundancy, disk Input/Output (I/O), and transaction count) from the container orchestration system (e.g., Kubernetes, Docker Swarm, or the like) in which the clusters are running via an appropriate container cluster manager (e.g., container cluster manager 170). Alternatively, the container cluster manager may periodically push such metrics to the CaaS controller.

At decision block 1020, a determination is made regarding whether an autoscaling policy has been triggered. According to one embodiment, the autoscaling policy is evaluated with reference to the metrics obtained in block 1010. For example, an appropriate value or statistical measure relating to appropriate metrics may be used to evaluate one or more rules of the autoscaling policy. When a rule is satisfied (e.g., when the conditional expression is true), processing continues with decision block 1030; otherwise, processing loops back to block 1010 to continue monitoring.

At decision block 1030, an autoscaling action is identified. According to one embodiment, the autoscaling action may be one of scale out, scale up, scale down, or scale in. In the context of the present example, when the autoscaling action corresponding to the rule that has been determined to be satisfied is scale out, then processing continues with block 1040. When the corresponding autoscaling action is scale up, then processing continues with block 1050. When the rule triggers a scale down autoscaling action, then processing continues with block 1060. When the corresponding autoscaling action is scale in, then processing continues with block 1070.
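The branching of decision blocks 1020 and 1030 may be summarized, purely for illustration, by the following Python sketch. The block numbers are those of FIG. 10; the Action enumeration and the next_block function are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    """The four autoscaling actions named in the description."""
    SCALE_OUT = "scale out"
    SCALE_UP = "scale up"
    SCALE_DOWN = "scale down"
    SCALE_IN = "scale in"


# Decision block 1030: each action leads to its own processing block.
NEXT_BLOCK = {
    Action.SCALE_OUT: 1040,   # identify a new machine to add
    Action.SCALE_UP: 1050,    # identify a machine to replace with a larger one
    Action.SCALE_DOWN: 1060,  # identify a machine to replace with a smaller one
    Action.SCALE_IN: 1070,    # identify a machine to remove
}


def next_block(rule_satisfied: bool, action: Action) -> int:
    """Decision block 1020 followed by decision block 1030.
    When no rule is satisfied, processing loops back to monitoring (1010)."""
    if not rule_satisfied:
        return 1010
    return NEXT_BLOCK[action]
```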

At block 1040, a new machine to be added to the cluster is identified. According to one embodiment, the scale out action identifies a type and quantity of machines to be added to the cluster at issue. Assuming it is a bare metal machine that is to be added to the cluster, based on the type and quantity of bare metal machines that are to be added, a machine selection process may be performed to identify the new machine(s). Non-limiting examples of machine selection processes that may be used have been described above with reference to FIGS. 8 and 9. According to one embodiment, a scale out action may involve cloud-bursting or workload shifting to deal with peaks in demand. For example, a workload may be performed within a private cloud of a customer and burst to a public cloud when needed to meet peak demands in excess of those capable of being satisfied with on-premises infrastructure (e.g., infrastructure 110). In this manner, customers of the managed container service may be provided with flexible capacity as needed.

At block 1050, an existing machine in the cluster that is to be replaced with a “larger” machine is identified. According to one embodiment, the existing machine may be a machine containing a resource whose metric(s) triggered the scale up action. For example, an existing “small” machine may be replaced with a machine of the next size up (e.g., a “medium” machine). Alternatively, if the machine containing the resource whose metric(s) triggered the scale up action is already the largest machine-size available, then a different machine (e.g., a “small” or “medium” machine) within the cluster may be identified for replacement. In some embodiments, as described above with reference to handling of a scale out action, a scale up action may involve replacing private cloud infrastructure (e.g., a bare metal machine located on-premises) with a physical or virtual machine in a public cloud. In other examples, the existing machine and the “larger” machine may both be bare metal machines in on-premises infrastructure inventory.

At block 1060, an existing machine in the cluster to be replaced with a “smaller” machine is identified. According to one embodiment, the existing machine may be a machine containing a resource whose metric(s) triggered the scale down action. For example, an existing “large” machine may be replaced with a machine of the next size down (e.g., a “medium” machine). Alternatively, if the machine containing the resource whose metric(s) triggered the scale down action is already the smallest machine-size available, then a different machine (e.g., a “medium” machine) within the cluster may be identified for replacement. In some embodiments, preference may be given to first reducing utilization of public cloud infrastructure. For example, a scale down action may involve replacing a physical or virtual machine in a public cloud with private cloud infrastructure (e.g., a bare metal machine located on-premises). In other examples, the existing machine and the “smaller” machine may both be bare metal machines in the on-premises infrastructure inventory.
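The identification logic of blocks 1050 and 1060 can be sketched in Python as shown below, for illustration only. The three-entry size ordering, the representation of a cluster as a mapping from machine name to size, and the choice of the first resizable machine as the fallback are all assumptions of this sketch rather than features of the described embodiments.

```python
from typing import Dict, Optional, Tuple

# Assumed ordering of machine sizes, smallest to largest.
SIZES = ["small", "medium", "large"]


def adjacent_size(size: str, step: int) -> Optional[str]:
    """Next size up (step=+1) or down (step=-1), or None at either end."""
    i = SIZES.index(size) + step
    return SIZES[i] if 0 <= i < len(SIZES) else None


def plan_resize(cluster: Dict[str, str], triggering: str,
                step: int) -> Optional[Tuple[str, str]]:
    """Return (machine to replace, replacement size).

    Prefers the machine whose metric(s) triggered the action. If that machine
    is already the largest (scale up) or smallest (scale down) size, falls
    back to the first other machine in the cluster that can still be resized.
    """
    target = adjacent_size(cluster[triggering], step)
    if target is not None:
        return triggering, target
    for name, size in cluster.items():
        target = adjacent_size(size, step)
        if target is not None:
            return name, target
    return None  # every machine is already at the limit


# Illustrative cluster only.
cluster = {"node-a": "small", "node-b": "large"}
scale_up_plan = plan_resize(cluster, "node-b", step=+1)
scale_down_plan = plan_resize(cluster, "node-a", step=-1)
```

Here node-b is already the largest size, so the scale up falls back to node-a, and node-a is already the smallest size, so the scale down falls back to node-b.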

At block 1070, a machine to be removed from the cluster is identified. According to one embodiment, the scale in action identifies a type and quantity of machines to be removed from the cluster at issue. Assuming it is a bare metal machine that is to be removed from the cluster, based on the type and quantity of bare metal machines that are to be removed, a machine selection process similar to that described above with reference to FIGS. 8 and 9 may be used, for example, to identify a machine to be removed that will result in minimization of excess resources of the remaining machines in the cluster. According to one embodiment, a scale in action may involve reversing cloud-bursting or workload shifting that has previously been performed to deal with peaks in demand. For example, after a peak in demand has passed, utilization of public cloud infrastructure by the cluster may first be reduced before reducing private cloud infrastructure (e.g., a bare metal machine located on-premises). In this manner, the cloud provider may meter a customer's actual usage of public cloud infrastructure and only bill the customer for what they use, thereby providing flexible and on-demand capacity when insufficient resources are available in the customer's private cloud.

In addition or as an alternative to the various optimizations, machine learning and big data analytic approaches described herein, in some embodiments, when identifying a new bare metal machine to add to a cluster (e.g., responsive to a scale out or scale up action that calls for a new machine to be identified to add to a cluster or to replace an existing “smaller” machine in a cluster), machine learning or big data analytics may be used to intelligently select a particular machine configuration (of the limited machines available in the inventory) that has evidenced an ability to perform well for the workload at issue. Similarly, with respect to a scale in or scale down action that calls for an existing machine within a cluster to be identified for removal or replacement by a “smaller” machine, identification of the existing machine to remove from the cluster may involve, for example, identifying a particular machine configuration that does not perform as well as other machines in the cluster for the particular workload(s) at issue.

While the above example is described in the context of “auto scaling,” the methodologies described herein are thought to be equally applicable to “auto placement.” So, for example, when evaluating whether to move work from one machine (M1), which may not be the most efficient for the work at issue for one or more reasons, to another (M2), which is deemed better for the work at issue, a copy of instances running on M1 may be brought up on M2 and then M1 may be shut down.

While the above example is described with reference to various metrics associated with infrastructure, those skilled in the art will appreciate a variety of other metrics from the environment may also be used. For example, the various metrics associated with operation of clusters evaluated in decision block 1020 may include, among other potential metrics, temperature, cooling cost (machines near vents might be cheaper to cool than those far away), power consumption (some machine types require less power per unit work), and/or power cost (a large installation might have more than one power source).

For sake of illustrating a concrete example of how policy-based constraints may influence auto scaling, consider an autoscaling policy that indicates a “large” worker node should be added when CPU utilization of any existing worker node is greater than 70% over a period of 1 hour and a policy-based constraint associated with the cluster expresses a desire to minimize power costs. In such a scenario, when the autoscaling policy is triggered to scale out the cluster, a particular power source associated with candidate machines may be taken into consideration when evaluating candidate machines for the new worker node to be added to the cluster. Similar policy-based constraints may be taken into consideration for scale up, scale in, and/or scale down actions.

FIG. 11 is a flow diagram illustrating autoscaling processing involving identifying a bare metal machine to be added to a cluster in accordance with an example embodiment. At block 1110, a metric is monitored that is associated with operation of a cluster deployed on behalf of a customer within a container orchestration system. Non-limiting examples of the metric include central processing unit (CPU) utilization, memory utilization, inbound and/or outbound network traffic, latency, disk Input/Output (I/O), and transaction count. Depending upon the particular implementation, the metric may be pulled by a CaaS controller (e.g., CaaS controller 160) periodically or on demand from the container orchestration system (e.g., Kubernetes, Docker Swarm, or the like) in which the cluster at issue is running via an appropriate container cluster manager (e.g., container cluster manager 170). Alternatively, the container cluster manager may push the metric to the CaaS controller.

At block 1120, responsive to a scaling event being identified for the cluster based on the monitoring and an autoscaling policy associated with the cluster, a BMaaS controller running within a private cloud of the customer is caused to create or update an inventory of bare metal machines available within the private cloud. According to one embodiment, the inventory contains real-time information indicative of respective resources (e.g., a number of processor cores, an amount of memory, network capacity, and/or storage performance) for one or more types of infrastructure (e.g., infrastructure 110), including a set of bare metal machines, that are currently available (e.g., are not currently deployed for use by another cluster) for use in connection with supporting the managed container service. Depending upon the particular implementation, the inventory may be requested from a BMaaS provider by the CaaS controller directly (e.g., via a bare metal SaaS portal of the BMaaS provider) or indirectly (e.g., via the CaaS portal).

At block 1130, a bare metal machine is identified for addition to the cluster based on the inventory received in block 1120, the autoscaling policy at issue, and a best fit algorithm configured in accordance with a policy established by or on behalf of the customer. According to one embodiment, the type and number of machines desired to be added to the cluster are identified by the autoscaling rule of the autoscaling policy that has been triggered. Alternatively, the type of machine to be added may be selected to be consistent with other existing machines that are already part of the cluster. Despite the customer presumably having a variety of bare metal machine configurations represented within their private cloud, it is unlikely the customer will have a sufficient number of such configurations to precisely match the range of all potential cluster requests. As such, embodiments described herein provide a policy-based approach to allow the cloud provider and/or the customer to express one or more machine-selection priorities to be applied as part of a best fit algorithm. Non-limiting examples of best fit processing that may be part of the best fit algorithm are described above with reference to FIGS. 8 and 9.

Embodiments described herein include various steps, examples of which have been described above. As described further below, these steps may be performed by hardware components or may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor programmed with the instructions to perform the steps. Alternatively, at least some steps may be performed by a combination of hardware, software, and/or firmware.

Embodiments described herein may be provided as a computer program product, which may include a machine-readable storage medium tangibly embodying thereon instructions, which may be used to program a computer (or other electronic devices) to perform a process. The machine-readable medium may include, but is not limited to, fixed (hard) drives, magnetic tape, floppy diskettes, optical disks, compact disc read-only memories (CD-ROMs), and magneto-optical disks, semiconductor memories, such as ROMs, PROMs, random access memories (RAMs), programmable read-only memories (PROMs), erasable PROMs (EPROMs), electrically erasable PROMs (EEPROMs), flash memory, magnetic or optical cards, or other type of media/machine-readable medium suitable for storing electronic instructions (e.g., computer programming code, such as software or firmware).

Various methods described herein may be practiced by combining one or more machine-readable storage media containing the code according to example embodiments described herein with appropriate standard computer hardware to execute the code contained therein. An apparatus for practicing various example embodiments described herein may involve one or more computing elements or computers (or one or more processors within a single computer) and storage systems containing or having network access to computer program(s) coded in accordance with various methods described herein, and the method steps of various example embodiments described herein may be accomplished by modules, routines, subroutines, or subparts of a computer program product.

FIG. 12 is a block diagram of a computer system in accordance with an embodiment. In the example illustrated by FIG. 12, computer system 1200 includes a processing resource 1210 coupled to a non-transitory, machine readable medium 1220 encoded with instructions to perform a private cloud gateway creation processing. The processing resource 1210 may include a microcontroller, a microprocessor, central processing unit core(s), an ASIC, an FPGA, and/or other hardware device suitable for retrieval and/or execution of instructions from the machine readable medium 1220 to perform the functions related to various examples described herein. Additionally or alternatively, the processing resource 1210 may include electronic circuitry for performing the functionality of the instructions described herein.

The machine readable medium 1220 may be any medium suitable for storing executable instructions. Non-limiting examples of machine readable medium 1220 include RAM, ROM, EEPROM, flash memory, a hard disk drive, an optical disc, or the like. The machine readable medium 1220 may be disposed within the computer system 1200, as shown in FIG. 12, in which case the executable instructions may be deemed “installed” or “embedded” on the computer system 1200. Alternatively, the machine readable medium 1220 may be a portable (e.g., external) storage medium, and may be part of an “installation package.” The instructions stored on the machine readable medium 1220 may be useful for implementing at least part of the methods described herein.

In the context of the present example, the machine readable medium 1220 is encoded with a set of executable instructions 1230-1250. It should be understood that part or all of the executable instructions and/or electronic circuits included within one block may, in alternate implementations, be included in a different block shown in the figures or in a different block not shown.

Instructions 1230, upon execution, cause the processing resource 1210 to monitor a metric associated with operation of a cluster deployed on behalf of a customer within a container orchestration system. In one embodiment, instructions 1230 may correspond generally to instructions for performing block 1110 of FIG. 11.

Instructions 1240, upon execution, cause the processing resource 1210 to, responsive to a scaling event being identified for the cluster based on the monitoring and an autoscaling policy associated with the cluster, cause a BMaaS controller running within a private cloud of the customer to create an inventory of bare metal machines available within the private cloud. In one embodiment, instructions 1240 may correspond generally to instructions for performing block 1120 of FIG. 11.

Instructions 1250, upon execution, cause the processing resource 1210 to identify a bare metal machine to be added to the cluster based on the inventory, the autoscaling policy at issue and a best fit algorithm configured in accordance with a policy established by the customer. In one embodiment, instructions 1250 may correspond generally to instructions for performing block 1130 of FIG. 11.

In the foregoing description, numerous details are set forth to provide an understanding of the subject matter disclosed herein. However, implementation may be practiced without some or all of these details. Other implementations may include modifications and variations from the details discussed above. It is intended that the following claims cover such modifications and variations.

What is claimed is:
 1. A system comprising: a processing resource; and a non-transitory computer-readable medium, coupled to the processing resource, having stored therein instructions that when executed by the processing resource cause the processing resource to: monitor a metric associated with operation of a cluster deployed on behalf of a customer of a managed container service within a container orchestration system; responsive to a scaling event being identified for the cluster based on the monitoring and an autoscaling policy associated with the cluster, cause a Bare-Metal-as-a-Service (BMaaS) provider associated with the private cloud to create an inventory of a plurality of bare-metal machines available within the private cloud; and identify a bare metal machine to be added to the cluster by selecting among the plurality of bare-metal machines based on the autoscaling policy, the inventory and a best fit algorithm configured in accordance with a policy established by or on behalf of the customer.
 2. The system of claim 1, wherein the instructions further cause the processing resource to receive the autoscaling policy as part of a definition of the cluster from a user of the customer, wherein the definition of the cluster includes information regarding a machine configuration desired to be used by the cluster.
 3. The system of claim 1, wherein the scaling event comprises triggering of a scale out action of a rule of the autoscaling policy.
 4. The system of claim 4, wherein the instructions further cause the processing resource to provide the customer with flexible capacity by using cloud-bursting or workload shifting when demand associated with the cluster exceeds a capacity of the private cloud.
 5. The system of claim 1, wherein the policy expresses a goal of minimizing excess resources of the plurality of bare metal machines, wherein the inventory includes information that quantifies a value of each resource for each of the plurality of bare-metal machines, and wherein the instructions further cause the processor to: identify a subset of the plurality of bare-metal machines having a type and a quantity of resources that satisfy a machine specification identified by the autoscaling policy; for each machine in the subset, computing an excess resource metric based on an amount of resources of the machine that are in excess of resources required to satisfy the machine specification; and selecting as the particular bare metal machine a machine of the subset having the excess resource metric indicative of a least amount of excess resources.
6. The system of claim 1, wherein the instructions further cause the processing resource to request the BMaaS provider to create the inventory via a BMaaS portal associated with the BMaaS provider.
7. The system of claim 6, wherein deployment of the cluster was responsive to a request received via a Container-as-a-Service (CaaS) portal.
8. The system of claim 7, wherein the CaaS portal and the BMaaS portal are operable within a public cloud.
9. The system of claim 1, wherein the system comprises a CaaS controller operable within the public cloud.
10. The system of claim 1, wherein the system comprises a CaaS controller operable within the private cloud.
11. A non-transitory machine readable medium storing instructions that when executed by a processing resource of a computer system cause the processing resource to: monitor a metric associated with operation of a cluster deployed on behalf of a customer of a managed container service within a container orchestration system; responsive to a scaling event being identified for the cluster based on the monitoring and an autoscaling policy associated with the cluster, cause a Bare-Metal-as-a-Service (BMaaS) provider associated with the private cloud to create an inventory of a plurality of bare-metal machines available within the private cloud; and identify a bare metal machine to be added to the cluster by selecting among the plurality of bare-metal machines based on the autoscaling policy, the inventory and a best fit algorithm configured in accordance with a policy established by or on behalf of the customer.
12. The non-transitory machine readable medium of claim 11, wherein the instructions further cause the processing resource to receive the autoscaling policy as part of a definition of the cluster from a user of the customer, wherein the definition of the cluster includes information regarding a machine configuration desired to be used by the cluster.
13. The non-transitory machine readable medium of claim 11, wherein the scaling event comprises triggering of a scale out action of a rule of the autoscaling policy.
14. The non-transitory machine readable medium of claim 13, wherein the instructions further cause the processing resource to provide the customer with flexible capacity by using cloud-bursting or workload shifting when demand associated with the cluster exceeds a capacity of the private cloud.
15. The non-transitory machine readable medium of claim 11, wherein the policy expresses a goal of minimizing excess resources of the plurality of bare metal machines, wherein the inventory includes information that quantifies a value of each resource for each of the plurality of bare-metal machines, and wherein the instructions further cause the processor to: identify a subset of the plurality of bare-metal machines having a type and a quantity of resources that satisfy a machine specification identified by the autoscaling policy; for each machine in the subset, computing an excess resource metric based on an amount of resources of the machine that are in excess of resources required to satisfy the machine specification; and selecting as the particular bare metal machine a machine of the subset having the excess resource metric indicative of a least amount of excess resources.
16. The non-transitory machine readable medium of claim 11, wherein the instructions further cause the processing resource to request the BMaaS provider to create the inventory via a BMaaS portal associated with the BMaaS provider.
17. A method comprising: monitoring, by a processing resource of a Container-as-a-Service (CaaS) controller of a managed container service, a metric associated with operation of a cluster deployed on behalf of a customer of the managed container service within a container orchestration system; responsive to a scaling event being identified for the cluster based on the monitoring and an autoscaling policy associated with the cluster, causing, by the processing resource, a Bare-Metal-as-a-Service (BMaaS) provider associated with the private cloud to create an inventory of a plurality of bare-metal machines available within the private cloud; and identifying, by the processing resource, a bare metal machine to be added to the cluster by selecting among the plurality of bare-metal machines based on the autoscaling policy, the inventory and a best fit algorithm configured in accordance with a policy established by or on behalf of the customer.
18. The method of claim 17, further comprising receiving, by the processing resource, the autoscaling policy as part of a definition of the cluster from a user of the customer, wherein the definition of the cluster includes information regarding a machine configuration desired to be used by the cluster.
19. The method of claim 17, wherein the scaling event comprises triggering of a scale out action of a rule of the autoscaling policy.
20. The method of claim 17, further comprising causing, by the processing resource, the customer to be provided with flexible capacity by using cloud-bursting or workload shifting when demand associated with the cluster exceeds a capacity of the private cloud.
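
By way of non-limiting illustration, the best fit selection recited in claims 5 and 15 (identify the subset of machines that satisfy the machine specification, compute an excess resource metric for each, and select the machine with the least excess) might be sketched as follows. The names `Machine`, `MachineSpec`, `excess_metric`, and `select_best_fit` are hypothetical and do not appear in the disclosure. The particular metric used here, a sum of per-resource excess normalized by the required quantity, is an assumption chosen only so that resources with different units can be combined; other metrics would serve equally well. The sketch also assumes every required quantity is positive.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Machine:
    """One inventory entry: resource type -> available quantity (hypothetical shape)."""
    name: str
    resources: Dict[str, float]


@dataclass
class MachineSpec:
    """Machine specification from the autoscaling policy: resource type -> required quantity."""
    required: Dict[str, float]


def satisfies(machine: Machine, spec: MachineSpec) -> bool:
    # A machine qualifies only if it has every required resource type
    # in at least the required quantity.
    return all(
        machine.resources.get(kind, 0.0) >= needed
        for kind, needed in spec.required.items()
    )


def excess_metric(machine: Machine, spec: MachineSpec) -> float:
    # Assumed metric: sum over required resources of (available - required) / required.
    # Normalizing makes CPU cores and GiB of memory comparable.
    return sum(
        (machine.resources[kind] - needed) / needed
        for kind, needed in spec.required.items()
    )


def select_best_fit(inventory: List[Machine], spec: MachineSpec) -> Optional[Machine]:
    # Step 1: keep only machines that satisfy the specification.
    subset = [m for m in inventory if satisfies(m, spec)]
    if not subset:
        return None  # nothing in the private cloud fits
    # Steps 2 and 3: compute the excess metric and pick the machine with the least excess.
    return min(subset, key=lambda m: excess_metric(m, spec))


inventory = [
    Machine("node-a", {"cpu": 16, "memory_gib": 64}),
    Machine("node-b", {"cpu": 32, "memory_gib": 128}),
    Machine("node-c", {"cpu": 8, "memory_gib": 256}),
]
spec = MachineSpec({"cpu": 12, "memory_gib": 48})
best = select_best_fit(inventory, spec)
```

In this sketch a return value of `None` indicates that no machine in the inventory satisfies the specification, which is the situation in which a controller might fall back to the cloud-bursting or workload shifting recited in claims 4, 14, and 20.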